stochastic variational deep kernel learning
Stochastic Variational Deep Kernel Learning
Deep kernel learning combines the non-parametric flexibility of kernel methods with the inductive biases of deep learning architectures. We propose a novel deep kernel learning model and stochastic variational inference procedure which generalizes deep kernel learning approaches to enable classification, multi-task learning, additive covariance structures, and stochastic gradient training. Specifically, we apply additive base kernels to subsets of output features from deep neural architectures, and jointly learn the parameters of the base kernels and deep network through a Gaussian process marginal likelihood objective. Within this framework, we derive an efficient form of stochastic variational inference which leverages local kernel interpolation, inducing points, and structure exploiting algebra. We show improved performance over stand alone deep networks, SVMs, and state of the art scalable Gaussian processes on several classification benchmarks, including an airline delay dataset containing 6 million training points, CIFAR, and ImageNet.
- Information Technology > Artificial Intelligence > Systems & Languages > Problem-Independent Architectures (0.61)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.61)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.61)
Stochastic Variational Deep Kernel Learning
Deep kernel learning combines the non-parametric flexibility of kernel methods with the inductive biases of deep learning architectures. We propose a novel deep kernel learning model and stochastic variational inference procedure which generalizes deep kernel learning approaches to enable classification, multi-task learning, additive covariance structures, and stochastic gradient training. Specifically, we apply additive base kernels to subsets of output features from deep neural architectures, and jointly learn the parameters of the base kernels and deep network through a Gaussian process marginal likelihood objective. Within this framework, we derive an efficient form of stochastic variational inference which leverages local kernel interpolation, inducing points, and structure exploiting algebra. We show improved performance over stand alone deep networks, SVMs, and state of the art scalable Gaussian processes on several classification benchmarks, including an airline delay dataset containing 6 million training points, CIFAR, and ImageNet.
- Information Technology > Artificial Intelligence > Systems & Languages > Problem-Independent Architectures (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)
Reviews: Stochastic Variational Deep Kernel Learning
This is a well-written paper with only a small number of grammatical and presentation errors, e.g. However, the experimental section lacks some detail on the training of DNNs and the pre-training. From the level of detail given it will be hard to reproduce the results. Furthermore the results are not too convincing. The authors claim that "flexible deep kernel GPs achieve superior performance" to DNNs and SVMs.
Parallel and Limited Data Voice Conversion Using Stochastic Variational Deep Kernel Learning
Jafaryani, Mohamadreza, Sheikhzadeh, Hamid, Pourahmadi, Vahid
Typically, voice conversion is regarded as an engineering problem with limited training data. The reliance on massive amounts of data hinders the practical applicability of deep learning approaches, which have been extensively researched in recent years. On the other hand, statistical methods are effective with limited data but have difficulties in modelling complex mapping functions. This paper proposes a voice conversion method that works with limited data and is based on stochastic variational deep kernel learning (SVDKL). At the same time, SVDKL enables the use of deep neural networks' expressive capability as well as the high flexibility of the Gaussian process as a Bayesian and non-parametric method. When the conventional kernel is combined with the deep neural network, it is possible to estimate non-smooth and more complex functions. Furthermore, the model's sparse variational Gaussian process solves the scalability problem and, unlike the exact Gaussian process, allows for the learning of a global mapping function for the entire acoustic space. One of the most important aspects of the proposed scheme is that the model parameters are trained using marginal likelihood optimization, which considers both data fitting and model complexity. Considering the complexity of the model reduces the amount of training data by increasing the resistance to overfitting. To evaluate the proposed scheme, we examined the model's performance with approximately 80 seconds of training data. The results indicated that our method obtained a higher mean opinion score, smaller spectral distortion, and better preference tests than the compared methods.
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- Asia > Japan (0.04)
Stochastic Variational Deep Kernel Learning
Wilson, Andrew G., Hu, Zhiting, Salakhutdinov, Russ R., Xing, Eric P.
Deep kernel learning combines the non-parametric flexibility of kernel methods with the inductive biases of deep learning architectures. We propose a novel deep kernel learning model and stochastic variational inference procedure which generalizes deep kernel learning approaches to enable classification, multi-task learning, additive covariance structures, and stochastic gradient training. Specifically, we apply additive base kernels to subsets of output features from deep neural architectures, and jointly learn the parameters of the base kernels and deep network through a Gaussian process marginal likelihood objective. Within this framework, we derive an efficient form of stochastic variational inference which leverages local kernel interpolation, inducing points, and structure exploiting algebra. We show improved performance over stand alone deep networks, SVMs, and state of the art scalable Gaussian processes on several classification benchmarks, including an airline delay dataset containing 6 million training points, CIFAR, and ImageNet.
- Information Technology > Artificial Intelligence > Systems & Languages > Problem-Independent Architectures (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)